Smart Paper Notes

Physics-informed Deep Super-resolution for Spatiotemporal Data

Pu Ren, Chengping Rao, Yang Liu, Zihan Ma, Qi Wang, Jian-Xun Wang, Hao Sun

Category: Machine Learning

2022-08-02

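High-fidelity simulation of complex physical systems is prohibitively expensive and inaccessible across spatiotemporal scales. Recently, there has been growing interest in using deep learning to augment scientific data from coarse-grained simulations, which are computationally cheap while retaining satisfactory solution accuracy. However, most existing work focuses on data-driven approaches, which rely on rich training datasets and lack sufficient physical constraints. To this end, we propose a novel and efficient spatiotemporal super-resolution framework based on physics-informed learning, inspired by the independence between temporal and spatial derivatives in partial differential equations (PDEs). The general principle is to use temporal interpolation for flow estimation and then introduce convolutional-recurrent neural networks to learn temporal refinement. For spatial reconstruction, we employ stacked residual blocks with wide activation and sub-pixel layers with pixel shuffle, where feature extraction is performed in a low-resolution latent space. In addition, we impose boundary conditions as hard constraints in the network to improve reconstruction accuracy. Extensive numerical experiments demonstrate the superior effectiveness and efficiency of the proposed method compared with baseline algorithms.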
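The abstract above mentions sub-pixel layers with pixel shuffle for spatial reconstruction. The following is a minimal sketch of the pixel-shuffle rearrangement only, written with plain nested lists; it is not the authors' network. The function name `pixel_shuffle` and the channel ordering (the same one commonly used by deep learning libraries) are assumptions made for illustration.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Each group of r*r low-resolution channels fills one r-by-r block
    of pixels in a single high-resolution output channel.
    """
    channels_in = len(x)
    height, width = len(x[0]), len(x[0][0])
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    channels_out = channels_in // (r * r)
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for i in range(r):          # row offset inside the r-by-r block
            for j in range(r):      # column offset inside the r-by-r block
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Four 1x2 low-resolution feature maps become one 2x4 high-resolution map.
low_res = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
high_res = pixel_shuffle(low_res, 2)
print(high_res)  # [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

Because the rearrangement has no learnable parameters, all convolutions before it can run at low resolution, which is consistent with the abstract's remark that feature extraction happens in a low-resolution latent space.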


TwiBot-22: Towards Graph-Based Twitter Bot Detection

Shangbin Feng, Zhaoxuan Tan, Herun Wan, Ningnan Wang, Zilong Chen, Binchi Zhang, Qinghua Zheng, Wenqian Zhang, Zhenyu Lei, Shujie Yang

Category: Artificial Intelligence

2022-06-09

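Twitter bot detection has become an increasingly important task for combating misinformation, facilitating social media moderation, and preserving the integrity of online discourse. State-of-the-art bot detection methods generally leverage the graph structure of the Twitter network, and they show promising performance against novel Twitter bots that traditional methods fail to detect. However, very few existing Twitter bot detection datasets are graph-based, and even those that are suffer from limited dataset scale, incomplete graph structure, and low annotation quality. In fact, the lack of a large-scale graph-based Twitter bot detection benchmark that addresses these issues has seriously hindered the development and evaluation of graph-based bot detection approaches. In this paper, we propose TwiBot-22, a comprehensive graph-based Twitter bot detection benchmark that presents the largest dataset to date, provides diversified entities and relations on the Twitter network, and has considerably better annotation quality than existing datasets. In addition, we re-implement 35 representative Twitter bot detection baselines and evaluate them on 9 datasets, including TwiBot-22, to promote a fair comparison of model performance and a holistic understanding of research progress. To facilitate further research, we consolidate all implemented code and datasets into the TwiBot-22 evaluation framework, where researchers can consistently evaluate new models and datasets. The TwiBot-22 Twitter bot detection benchmark and evaluation framework are publicly available at https://twibot22.github.io/.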


Federated Two-stage Learning with Sign-based Voting

Zichen Ma, Zihan Lu, Yu Lu, Wenye Li, Jinfeng Yi, Shuguang Cui

Category: Machine Learning

2021-12-10

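Federated learning is a distributed machine learning mechanism in which local devices collaboratively train a shared global model under the orchestration of a central server, while all private data stays decentralized. In such a system, model parameters and their updates are transmitted instead of raw data, so the communication bottleneck has become a key challenge. Moreover, recent larger and deeper machine learning models are harder to deploy in a federated environment. In this paper, we design a federated two-stage learning framework that augments prototypical federated learning with a cut layer on devices and uses sign-based stochastic gradient descent (SGD) with majority voting for model updates. The cut layer on each device learns informative, low-dimensional representations of the raw data locally, which helps reduce the number of global model parameters and prevents data leakage. Sign-based SGD with majority voting for model updates also helps alleviate communication limitations. Empirically, we show that our system is an efficient and privacy-preserving federated learning scheme that suits general application scenarios.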
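The sign-based SGD with majority voting mentioned above can be sketched as follows. This is a minimal illustration of the general sign-and-vote idea, not the authors' implementation: the function names, the learning rate, the toy gradients, and the choice to leave a coordinate unchanged on a tied vote are all assumptions.

```python
def sign(v):
    """Return +1, -1, or 0 according to the sign of v."""
    return (v > 0) - (v < 0)


def majority_vote_step(weights, worker_grads, lr):
    """Apply one sign-based update aggregated by majority vote.

    Each worker transmits only the sign of every gradient coordinate.
    The server sums the signs per coordinate and takes the sign of the sum.
    """
    worker_signs = [[sign(g) for g in grad] for grad in worker_grads]
    votes = [sign(sum(column)) for column in zip(*worker_signs)]
    new_weights = [w - lr * v for w, v in zip(weights, votes)]
    return new_weights, votes


# Hypothetical gradients from three devices for a three-parameter model.
weights = [1.0, 1.0, 1.0]
worker_grads = [
    [0.5, -2.0, 0.1],
    [0.2, 3.0, -0.4],
    [-9.0, -0.1, 0.0],
]
new_weights, votes = majority_vote_step(weights, worker_grads, lr=0.5)
print(votes)        # [1, -1, 0]
print(new_weights)  # [0.5, 1.5, 1.0]
```

Only a sign per coordinate travels in each direction, which is where the communication saving comes from. In the first coordinate the vote is positive even though one device reports a large negative value, so a single large-magnitude gradient cannot dominate the update the way it would under averaging.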


Esports Data-to-commentary Generation on Large-scale Data-to-text Dataset

Zihan Wang, Naoki Yoshinaga

Category: Natural Language Processing

2022-12-21

Esports, a sports competition using video games, has become one of the most important sporting events in recent years. Although the amount of esports data is increasing more than ever, only a small fraction of it is accompanied by text commentaries that help the audience retrieve and understand the plays. Therefore, in this study, we introduce the task of generating game commentaries from structured data records to address the problem. We first build a large-scale esports data-to-text dataset using structured data and commentaries from a popular esports game, League of Legends. On this dataset, we devise several data preprocessing methods, including linearization and data splitting, to improve its quality. We then introduce several baseline encoder-decoder models and propose a hierarchical model to generate game commentaries. Considering the characteristics of esports commentaries, we design evaluation metrics covering three aspects of the output: correctness, fluency, and strategic depth. Experimental results on our large-scale esports dataset confirm the advantage of the hierarchical model and reveal several challenges of this novel task.


In-Sensor & Neuromorphic Computing are all you need for Energy Efficient Computer Vision

Gourav Datta, Zeyu Liu, Md Abdullah-Al Kaiser, Souvik Kundu, Joe Mathai, Zihan Yin, Ajey P. Jacob, Akhilesh R. Jaiswal, Peter A. Beerel

Category: Computer Vision

2022-12-21

Due to the high activation sparsity and use of accumulates (AC) instead of expensive multiply-and-accumulates (MAC), neuromorphic spiking neural networks (SNNs) have emerged as a promising low-power alternative to traditional DNNs for several computer vision (CV) applications. However, most existing SNNs require multiple time steps for acceptable inference accuracy, hindering real-time deployment and increasing spiking activity and, consequently, energy consumption. Recent works proposed direct encoding that directly feeds the analog pixel values in the first layer of the SNN in order to significantly reduce the number of time steps. Although the overhead for the first layer MACs with direct encoding is negligible for deep SNNs and the CV processing is efficient using SNNs, the data transfer between the image sensors and the downstream processing costs significant bandwidth and may dominate the total energy. To mitigate this concern, we propose an in-sensor computing hardware-software co-design framework for SNNs targeting image recognition tasks. Our approach reduces the bandwidth between sensing and processing by 12-96x and the resulting total energy by 2.32x compared to traditional CV processing, with a 3.8% reduction in accuracy on ImageNet.


Reconstructing Training Data from Model Gradient, Provably

Zihan Wang, Jason Lee, Qi Lei

Category: Machine Learning | Machine Learning (Statistics)

2022-12-07

Understanding when and how much a model gradient leaks information about the training sample is an important question in privacy. In this paper, we present a surprising result: even without training or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value. We prove the identifiability of the training data under mild conditions: with shallow or deep neural networks and a wide range of activation functions. We also present a statistically and computationally efficient algorithm based on tensor decomposition to reconstruct the training data. As a provable attack that reveals sensitive training data, our findings suggest potentially severe threats to privacy, especially in federated learning.
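To make the idea of gradient leakage concrete, here is a much simpler and well-known special case, not the tensor-decomposition algorithm described in the abstract. For a single sample passed through one fully connected layer with a bias, the weight gradient is the outer product of the bias gradient and the input, so the input can be read off by division. The squared loss, the toy numbers, and all function names below are assumptions made for illustration.

```python
def linear_layer_gradients(W, b, x, y):
    """Gradients of 0.5 * ||W x + b - y||^2 with respect to W and b."""
    z = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(W, b)]
    delta = [z_i - y_i for z_i, y_i in zip(z, y)]      # dL/dz, also dL/db
    grad_W = [[d * x_j for x_j in x] for d in delta]   # outer(delta, x)
    grad_b = list(delta)
    return grad_W, grad_b


def reconstruct_input(grad_W, grad_b, eps=1e-12):
    """Recover x from any row i where the bias gradient is nonzero.

    Row i of grad_W equals grad_b[i] * x, so x = grad_W[i] / grad_b[i].
    """
    for row, g_b in zip(grad_W, grad_b):
        if abs(g_b) > eps:
            return [g / g_b for g in row]
    raise ValueError("all bias gradients are zero; input is not recoverable here")


# Hypothetical private sample and layer parameters.
W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
b = [0.1, -0.2]
x_private = [3.0, -1.0, 0.5]
y = [1.0, 0.0]

grad_W, grad_b = linear_layer_gradients(W, b, x_private, y)
x_recovered = reconstruct_input(grad_W, grad_b)
print(x_recovered)  # approximately [3.0, -1.0, 0.5]
```

The recovery uses only the shared gradient, never the data itself, which is why gradient sharing in federated learning is a privacy concern. This shortcut relies on a single sample and an observed bias gradient; the abstract's result concerns a more general setting.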


RHO ($ρ$): Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding

Ziwei Ji, Zihan Liu, Nayeon Lee, Tiezheng Yu, Bryan Wilie, Min Zeng, Pascale Fung

Category: Natural Language Processing | Artificial Intelligence

2022-12-03

Dialogue systems can leverage large pre-trained language models and knowledge to generate fluent and informative responses. However, these models are still prone to producing hallucinated responses not supported by the input source, which greatly hinders their application. The heterogeneity between external knowledge and dialogue context challenges representation learning and source integration, and further contributes to unfaithfulness. To handle this challenge and generate more faithful responses, this paper presents RHO ($\rho$), which utilizes the representations of linked entities and relation predicates from a knowledge graph (KG). We propose (1) local knowledge grounding to combine textual embeddings with the corresponding KG embeddings; and (2) global knowledge grounding to equip RHO with multi-hop reasoning abilities via the attention mechanism. In addition, we devise a response re-ranking technique based on walks over KG sub-graphs for better conversational reasoning. Experimental results on OpenDialKG show that our approach outperforms state-of-the-art methods by a large margin on both automatic and human evaluation, especially in hallucination reduction (17.54% in FeQA).


Imperceptible Adversarial Attack via Invertible Neural Networks

Zihan Chen, Ziyue Wang, Junjie Huang, Wentao Zhao, Xiao Liu, Dejian Guan

Category: Computer Vision

2022-11-28

Adding perturbations using auxiliary gradient information and discarding existing details of the benign images are two common approaches for generating adversarial examples. Though visual imperceptibility is the desired property of adversarial examples, conventional adversarial attacks still generate traceable adversarial perturbations. In this paper, we introduce a novel Adversarial Attack via Invertible Neural Networks (AdvINN) method to produce robust and imperceptible adversarial examples. Specifically, AdvINN takes full advantage of the information preservation property of Invertible Neural Networks and thereby generates adversarial examples by simultaneously adding class-specific semantic information of the target class and dropping discriminant information of the original class. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet-1K demonstrate that the proposed AdvINN method can produce less perceptible adversarial images than the state-of-the-art methods, and that AdvINN yields more robust adversarial examples with high confidence compared to other adversarial attacks.


Regret Bounds for Noise-Free Cascaded Kernelized Bandits

Zihan Li, Jonathan Scarlett

Category: Machine Learning (Statistics) | Machine Learning

2022-11-10

We consider optimizing a function network in the noise-free grey-box setting with RKHS function classes, where the exact intermediate results are observable. We assume that the structure of the network is known (but not the underlying functions comprising it), and we study three types of structures: (1) chain: a cascade of scalar-valued functions, (2) multi-output chain: a cascade of vector-valued functions, and (3) feed-forward network: a fully connected feed-forward network of scalar-valued functions. We propose a sequential upper confidence bound based algorithm GPN-UCB along with a general theoretical upper bound on the cumulative regret. For the Matérn kernel, we additionally propose a non-adaptive sampling based method along with its theoretical upper bound on the simple regret. We also provide algorithm-independent lower bounds on the simple regret and cumulative regret, showing that GPN-UCB is near-optimal for chains and multi-output chains in broad cases of interest.


Cross-Domain Local Characteristic Enhanced Deepfake Video Detection

Zihan Liu, Hanyi Wang, Shilin Wang

Category: Computer Vision

2022-11-07

As ultra-realistic face forgery techniques emerge, deepfake detection has attracted increasing attention due to security concerns. Many detectors cannot achieve accurate results when detecting unseen manipulations despite excellent performance on known forgeries. In this paper, we are motivated by the observation that the discrepancies between real and fake videos are extremely subtle and localized, and inconsistencies or irregularities can exist in some critical facial regions across various information domains. To this end, we propose a novel pipeline, Cross-Domain Local Forensics (XDLF), for more general deepfake video detection. In the proposed pipeline, a specialized framework is presented to simultaneously exploit local forgery patterns from the spatial, frequency, and temporal domains, thus learning cross-domain features to detect forgeries. Moreover, the framework leverages four high-level forgery-sensitive local regions of a human face to guide the model to enhance subtle artifacts and localize potential anomalies. Extensive experiments on several benchmark datasets demonstrate the impressive performance of our method, and we achieve superiority over several state-of-the-art methods on cross-dataset generalization. We also examine the factors that contribute to its performance through ablations, which suggest that exploiting cross-domain local characteristics is a noteworthy direction for developing more general deepfake detectors.
